[AMD] Add Minimax tp8 with ep and update vllm image for MI355x#927

Merged
chunfangamd merged 9 commits into main from minimax-mi355-opt
Mar 23, 2026
Conversation

@benenzhu
Collaborator

Add a TP=8 configuration with expert parallelism (EP) for concurrency 32 to 256 for MiniMax on MI355X.

@benenzhu benenzhu requested a review from a team March 23, 2026 10:20
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first-class so that the entire ML community can benefit from your hard work! Thank you.

@benenzhu
Collaborator Author

/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm

@github-actions
Contributor

@benenzhu Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23432526160
Command: test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: 9d2248d
Approval: not required (trusted collaborator).

Contributor

@claude claude bot left a comment

LGTM — straightforward config and script extension; the only issue is the nit flagged inline (wrong PR link in changelog).

Overview

This PR touches three files: the AMD benchmark config (amd-master.yaml), the MiniMax MI355X benchmark shell script, and perf-changelog.yaml. The config adds { tp: 8, ep: 8, conc-start: 32, conc-end: 256 } to all three seq-len configs for minimaxm2.5-fp8-mi355x-vllm and bumps the image from v0.15.1 to v0.18.0. The shell script adds EP_SIZE as a required env var, conditionally passes --enable-expert-parallel to vllm serve, and removes the previously-present --disable-log-requests flag (consistent with a recent standalone commit c35e700 that removed it). The changelog gets one new entry.
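The script change summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual script: the helper names `require_env` and `build_serve_args`, the `TP` variable, and the `EP_SIZE > 1` condition are hypothetical. Only `EP_SIZE` and `--enable-expert-parallel` come from the description above.

```shell
# Hypothetical helper: fail if any named environment variable is unset or empty.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required env var: $name" >&2
      return 1
    fi
  done
}

# Hypothetical helper: print the parallelism flags for `vllm serve`.
# Assumption: expert parallelism is switched on only when EP size is above 1.
build_serve_args() {
  tp="$1"
  ep="$2"
  args="--tensor-parallel-size $tp"
  if [ "$ep" -gt 1 ]; then
    args="$args --enable-expert-parallel"
  fi
  printf '%s\n' "$args"
}
```

A launch line might then look something like `vllm serve "$MODEL" $(build_serve_args "$TP" "$EP_SIZE")`, where `MODEL` is likewise a placeholder.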

Security Risks

No security-sensitive code paths are touched. This is purely benchmark configuration and a shell launch script for an inference server. No auth, crypto, or permission logic is involved.

Level of Scrutiny

Low scrutiny is warranted. All three changes are mechanical and follow well-established patterns already present in the repo (the EP conditional mirrors the same pattern used in other MiniMax scripts). The image bump and new TP=8/EP=8 search-space entries are additive and consistent with peer configs (e.g., dsr1-fp4-mi355x-atom uses identical ep-based entries).
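For illustration, if the sweep steps concurrency by doubling (an assumption; the stepping rule is not stated in this thread), the new `conc-start: 32, conc-end: 256` entry would expand to four points per sequence-length config. The helper name `expand_conc` is hypothetical.

```shell
# Hypothetical helper: list concurrency points from start to end, inclusive.
# Assumption: the sweep doubles concurrency at each step.
expand_conc() {
  conc="$1"
  end="$2"
  # Guard against a non-positive start, which would never reach the end.
  [ "$conc" -gt 0 ] || return 1
  out=""
  while [ "$conc" -le "$end" ]; do
    out="${out:+$out }$conc"
    conc=$((conc * 2))
  done
  printf '%s\n' "$out"
}

expand_conc 32 256   # prints: 32 64 128 256
```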

Other Factors

The only bug found is a nit: the new perf-changelog.yaml entry references /pull/868 instead of /pull/927. This is a documentation/traceability issue with no functional impact. The functional changes are correct and complete. No outstanding human reviewer comments exist in the timeline.
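A mismatch like this could be caught mechanically. The sketch below assumes the newest changelog entry is the last one in the file and that its link contains `/pull/<number>`; neither the file layout nor the helper name `last_pr_link_matches` is taken from the repository.

```shell
# Hypothetical check: read changelog text on stdin and succeed only if the
# last /pull/<number> reference matches the expected PR number.
# Assumption: the newest entry is appended at the end of the file.
last_pr_link_matches() {
  expected="$1"
  last=$(grep -o '/pull/[0-9][0-9]*' | tail -n 1)
  [ "$last" = "/pull/$expected" ]
}
```

It could be run as `last_pr_link_matches 927 < perf-changelog.yaml`, which returns a non-zero status when the final link points at a different PR.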

@cquil11
Collaborator

cquil11 commented Mar 23, 2026

/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm --evals-only

@github-actions
Contributor

@cquil11 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23438981061
Command: test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm --evals-only
Pinned ref: 9d2248d
Approval: not required (trusted collaborator).

@cquil11
Collaborator

cquil11 commented Mar 23, 2026

Once the accuracy check looks good, we can merge.

@benenzhu
Collaborator Author

The vLLM recipes PR for this change: vllm-project/recipes#300

Collaborator

@chunfangamd chunfangamd left a comment

1K1K: +21.3%
8K1K: +14.3%

@chunfangamd chunfangamd enabled auto-merge (squash) March 23, 2026 14:16
@chunfangamd chunfangamd merged commit 0d571de into main Mar 23, 2026
13 checks passed
@chunfangamd chunfangamd deleted the minimax-mi355-opt branch March 23, 2026 14:38
@cquil11 cquil11 added the AMD label Apr 8, 2026
3 participants